A Text Categorization Approach to Video Scene Classification Using Keypoint Features

Authors

  • Jun Yang
  • Alexander G. Hauptmann
Abstract

Scene classification based on local keypoint features has emerged as a promising research direction. Each image is represented by a “bag of visual words”, a high-dimensional vector of vector-quantized keypoint features, which is analogous to the “bag of words” representation of text documents. Based on this representation, we treat scene classification entirely as a text categorization problem, with an emphasis on comparatively evaluating various implementation choices related to the visual-word representation, including vocabulary size, feature weighting and normalization, feature selection, etc. We intend to provide practical insights into the representation choices that maximize classification performance. Scene classification experiments on a large video corpus lead to many important observations: 1) using our approach, carefully engineered visual-word features achieve performance comparable to that of traditional color/texture features, and their combination significantly enhances the performance; 2) the visual word distribution in the video corpus shares many similarities with, yet differs in important ways from, the word distribution in a text corpus; 3) a vocabulary much larger than the ones currently used is preferred; 4) frequent visual words are not “stop words” but are more informative than rare words; 5) binary features are as effective as tf or tf-idf features, and normalization always hurts the classification performance; 6) feature selection can reduce the vocabulary size by half without loss of performance; 7) spatial information is much more useful with a small vocabulary than with a large one.
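The weighting schemes compared in the abstract (binary, tf, tf-idf, with or without normalization) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the idf form log(N/df) is the common text-retrieval variant, L2 is one possible normalization, and the function names and toy data are hypothetical. Each image is assumed to have already been reduced to a list of visual-word IDs by vector-quantizing its keypoint descriptors.

```python
import math
from collections import Counter


def bag_of_visual_words(word_ids, vocab_size):
    """Raw term frequency (tf): how often each visual word occurs in one image."""
    counts = Counter(word_ids)
    return [counts.get(w, 0) for w in range(vocab_size)]


def binary_weights(tf):
    """Binary features: 1 if the visual word occurs at all, else 0."""
    return [1 if c > 0 else 0 for c in tf]


def tfidf_weights(tf_vectors):
    """tf-idf with idf = log(N / df); this idf form is an assumption."""
    n = len(tf_vectors)
    vocab_size = len(tf_vectors[0])
    # Document frequency: number of images containing each visual word.
    df = [sum(1 for tf in tf_vectors if tf[w] > 0) for w in range(vocab_size)]
    idf = [math.log(n / d) if d > 0 else 0.0 for d in df]
    return [[c * i for c, i in zip(tf, idf)] for tf in tf_vectors]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


# Toy corpus: three images, each a list of quantized keypoint IDs (vocabulary of 4).
images = [[0, 0, 2], [1, 2, 2, 2], [0, 3]]
tf_vectors = [bag_of_visual_words(ids, 4) for ids in images]
binary_vectors = [binary_weights(tf) for tf in tf_vectors]
tfidf_vectors = tfidf_weights(tf_vectors)
normalized = [l2_normalize(v) for v in tfidf_vectors]
```

Any of the four resulting representations could then be fed to a standard classifier, which is how choices such as binary versus tf-idf, or normalized versus unnormalized, can be compared on equal footing.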

Similar resources

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...

Fine-grained Visual Categorization using PAIRS: Pose and Appearance Integration for Recognizing Subcategories

In Fine-grained Visual Categorization (FGVC), the differences between similar categories are often highly localized to a small number of object parts (see Figure 1), and significant pose variation therefore constitutes a great challenge for identification. To address this, we propose extracting image patches using pairs of predicted keypoint locations as anchor points. The benefits of this appr...

An Image Based Approach for Content Analysis in Document Collections

We consider the task of content based analysis and categorization in large-scale historical book scanning projects. Mixed content, deprecated language, noise and unexpected distortions suggest an image based approach. The use of keypoint extractors combined with the bag of features approach is applied to scanned text documents. In order to incorporate spatial information into the bag of feature...

An Improved Motion Vector Estimation Approach for Video Error Concealment Based on the Video Scene Analysis

In order to enhance the accuracy of the motion vector (MV) estimation and also reduce the error propagation issue during the estimation, in this paper, a new adaptive error concealment (EC) approach is proposed based on the information extracted from the video scene. In this regard, the motion information of the video scene around the degraded MB is first analyzed to estimate the motion type of...

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and the Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...

Publication date: 2006